In this part of the course, we will cover the following concepts:
| Objective | Complete |
|---|---|
| Formulate the process of using ggplot2 to build plots | |
| Build a histogram and a scatterplot with ggplot2 | |

ggplot2

ggplot2 is a tidyverse package used for creating graphics grammatically (i.e., “grammar of graphics,” or gg, plots)

ggplot2 package

ggplot2 package offers the following features:

| Objective | Complete |
|---|---|
| Formulate the process of using ggplot2 to build plots | ✔ |
| Build a histogram and a scatterplot with ggplot2 | |

- Use ggplot2 to visualize a sample dataset
- Use the box package and encode your directory structure into variables
- Let main_dir be the variable corresponding to your materials folder
- There is a data directory inside of the materials folder in your environment, so we’ll save its path to a data_dir variable
- There is also a plots directory corresponding to the plot_dir variable
- Use the paste0 command and pass the strings you would like to paste together

Stroke Dataset: attribute information

- Load the dataset from data_dir into R’s environment

'data.frame': 5110 obs. of 12 variables:
$ id : int 9046 51676 31112 60182 1665 56669 53882 10434 27419 60491 ...
$ gender : chr "Male" "Female" "Male" "Female" ...
$ age : num 67 61 80 49 79 81 74 69 59 78 ...
$ hypertension : int 0 0 0 0 1 0 1 0 0 0 ...
$ heart_disease : int 1 0 1 0 0 0 1 0 0 0 ...
$ ever_married : chr "Yes" "Yes" "Yes" "Yes" ...
$ work_type : chr "Private" "Self-employed" "Private" "Private" ...
$ Residence_type : chr "Urban" "Rural" "Rural" "Urban" ...
$ avg_glucose_level: num 229 202 106 171 174 ...
$ bmi : num 36.6 NA 32.5 34.4 24 29 27.4 22.8 NA 24.2 ...
$ smoking_status : chr "formerly smoked" "never smoked" "never smoked" "smokes" ...
$ stroke : int 1 1 1 1 1 1 1 1 1 1 ...
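
The directory variables described above can be sketched as follows. This is a minimal illustration, not the course's exact code: the `~/materials` location and the file name `stroke_data.csv` are hypothetical placeholders, since the source does not state them.

```r
# Hypothetical location of the course materials folder; adjust to your machine.
main_dir <- "~/materials"

# Build the sub-directory paths by pasting strings together with paste0.
data_dir <- paste0(main_dir, "/data")
plot_dir <- paste0(main_dir, "/plots")

# Hypothetical file name; the source does not give the actual one.
data_file <- paste0(data_dir, "/stroke_data.csv")

# Only attempt to read the file when it is actually present.
if (file.exists(data_file)) {
  health_data <- read.csv(data_file, stringsAsFactors = FALSE)
  str(health_data)
}
```

Keeping the paths in variables means that only `main_dir` has to change when the materials folder moves.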

Replace the NAs in the bmi column with the mean

# Convert BMI to numeric
health_data$bmi <- as.numeric(health_data$bmi)
# Replace NA values in the BMI column with the mean of the observed values
health_data$bmi[is.na(health_data$bmi)] <- mean(health_data$bmi, na.rm = TRUE)

'data.frame': 5110 obs. of 12 variables:
$ id : int 9046 51676 31112 60182 1665 56669 53882 10434 27419 60491 ...
$ gender : chr "Male" "Female" "Male" "Female" ...
$ age : num 67 61 80 49 79 81 74 69 59 78 ...
$ hypertension : int 0 0 0 0 1 0 1 0 0 0 ...
$ heart_disease : int 1 0 1 0 0 0 1 0 0 0 ...
$ ever_married : chr "Yes" "Yes" "Yes" "Yes" ...
$ work_type : chr "Private" "Self-employed" "Private" "Private" ...
$ Residence_type : chr "Urban" "Rural" "Rural" "Urban" ...
$ avg_glucose_level: num 229 202 106 171 174 ...
$ bmi : num 36.6 28.9 32.5 34.4 24 ...
$ smoking_status : chr "formerly smoked" "never smoked" "never smoked" "smokes" ...
$ stroke : int 1 1 1 1 1 1 1 1 1 1 ...
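
To see what the mean replacement does on a small scale, here is a toy example. The vector below is made up for illustration and is not taken from the stroke dataset.

```r
# Toy stand-in for the bmi column (illustrative values only).
bmi <- c(30, NA, 20, 25, NA)

# Mean of the observed values only: (30 + 20 + 25) / 3 = 25
bmi_mean <- mean(bmi, na.rm = TRUE)

# Overwrite only the missing entries; observed values stay untouched.
bmi[is.na(bmi)] <- bmi_mean

print(bmi)
```

Without `na.rm = TRUE`, `mean()` would return `NA` and every missing entry would stay missing.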
Suppose we only want to work with a few of these variables rather than the entire dataset. We can restructure the data by taking a subset that keeps all observations of the following variables:

- age variable
- avg_glucose_level variable
- bmi variable

Since we know which columns we want to subset in the dataset, we can use their column names to extract them

'data.frame': 5110 obs. of 3 variables:
$ age : num 67 61 80 49 79 81 74 69 59 78 ...
$ avg_glucose_level: num 229 202 106 171 174 ...
$ bmi : num 36.6 28.9 32.5 34.4 24 ...
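
A subset like the one printed above could be produced by indexing with a character vector of column names. The sketch below uses a three-row stand-in for `health_data` so that it runs on its own; the helper name `keep_cols` is an assumption, while `health_subset` is the name used later in this section.

```r
# Small stand-in for health_data so the example is self-contained.
health_data <- data.frame(
  id = 1:3,
  gender = c("Male", "Female", "Male"),
  age = c(67, 61, 80),
  avg_glucose_level = c(229, 202, 106),
  bmi = c(36.6, 28.9, 32.5)
)

# Columns to keep, referenced by name (keep_cols is a hypothetical helper).
keep_cols <- c("age", "avg_glucose_level", "bmi")

# Keep all rows, but only the named columns.
health_subset <- health_data[, keep_cols]

str(health_subset)
```

Selecting by name rather than by position keeps the code valid even if the column order of the source file changes.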

ggplot2

Since ggplot2 is an external package, we need to install it first

ggplot2 package

- geom (i.e., a layer)
- geoms at the tidyverse website

ggplot2: setup

age

In order to make a base plot we need to:

Translated into ggplot2 syntax, this will require us to call the ggplot function and:

- Specify the data (i.e. health_subset)
- Map the variable (i.e. age to be plotted on x-axis)
- Add a geom layer, in our case it will be a histogram (e.g. add geom_histogram to the base plot)

geom_histogram

ggplot2: adjust

To make the visualization prominent, we need to adjust:
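
As a minimal sketch of the setup steps and a few typical adjustments, one might write something like the following. The data frame is a randomly generated stand-in for `health_subset`, and the `binwidth`, `fill`, and `color` values, as well as the object names `ggp_base` and `ggp_hist`, are illustrative assumptions rather than the settings used in the course.

```r
# install.packages("ggplot2")  # run once if the package is not yet installed
library(ggplot2)

# Randomly generated stand-in for health_subset (not the real data).
set.seed(1)
health_subset <- data.frame(age = round(runif(200, min = 1, max = 82)))

# Setup: name the data and map age to the x-axis.
ggp_base <- ggplot(health_subset, aes(x = age))

# Add the histogram layer and adjust bin width and colors.
ggp_hist <- ggp_base +
  geom_histogram(binwidth = 5, fill = "steelblue", color = "white")

print(ggp_hist)
```

The base plot alone draws only empty axes; nothing appears until a geom layer is added with `+`.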

Depending on the plot and the final goal, you may want to:

geom_histogram: adjust

ggplot2: polish

In order to make the visualization complete, we need to polish it by adding more elements:
Depending on the plot and the final goal, you may want to:
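
For instance, a title, a subtitle, and axis labels can be attached with `labs()`. The sketch below builds a plot object named `ggp1`, the name used by the theme code in this section; the label text and the stand-in data are assumptions made for illustration.

```r
library(ggplot2)

# Randomly generated stand-in for health_subset (not the real data).
set.seed(1)
health_subset <- data.frame(age = round(runif(200, min = 1, max = 82)))

# Histogram of age with descriptive labels (label text is illustrative).
ggp1 <- ggplot(health_subset, aes(x = age)) +
  geom_histogram(binwidth = 5, fill = "steelblue", color = "white") +
  labs(title = "Distribution of age",
       subtitle = "Stand-in data for illustration",
       x = "Age (years)",
       y = "Count")

print(ggp1)
```

Saving the plot to a variable makes it possible to keep adding layers and theme settings to it later.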

geom_histogram: polish

# Add a black and white theme to
# overwrite default.
ggp1 <- ggp1 +
  # Add a black and white theme.
  theme_bw() +
  # Customize elements of the theme.
  theme(axis.title = element_text(size = 20),
        axis.text = element_text(size = 16),
        plot.title = element_text(size = 25),
        plot.subtitle = element_text(size = 18))

geom_point: set up

The histogram was a univariate visualization, so we only had to define a single axis
But for bivariate visualizations, like scatterplots, we need to define 2 axes:

- x-axis
- y-axis

In ggplot terms, this means that we need to map 2 aes parameters (x and y respectively)
Let’s plot the relationship between age on x-axis and avg_glucose_level on y-axis
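
A scatterplot along these lines could be sketched as follows. The data are randomly generated stand-ins, and `alpha = 0.5`, the label text, and the object name `ggp2` are assumptions. The trend line uses `geom_smooth()` with `method = "lm"`, which is also an assumption; the source only shows that `geom_smooth()` was used, not which method.

```r
library(ggplot2)

# Randomly generated stand-in for health_subset (not the real data).
set.seed(1)
n <- 200
age <- round(runif(n, min = 1, max = 82))
health_subset <- data.frame(
  age = age,
  avg_glucose_level = 80 + 0.5 * age + rnorm(n, sd = 20)
)

# Map two aesthetics: age to x and avg_glucose_level to y.
ggp2 <- ggplot(health_subset, aes(x = age, y = avg_glucose_level)) +
  # One point per observation; alpha softens overplotting.
  geom_point(alpha = 0.5) +
  # Optional fitted trend line (linear model is an assumption here).
  geom_smooth(method = "lm") +
  labs(title = "Average glucose level vs. age",
       x = "Age (years)",
       y = "Average glucose level")

print(ggp2)
```

Printing a plot that contains `geom_smooth()` without an explicit formula emits an informational message about the formula being used; it is not an error.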

geom_point

`geom_smooth()` using formula 'y ~ x'

theme_bw() +
  theme(axis.title = element_text(size = 20),
        axis.text = element_text(size = 16),
        plot.title = element_text(size = 25),
        plot.subtitle = element_text(size = 18))

List of 93
$ line :List of 6
..$ colour : chr "black"
..$ size : num 0.5
..$ linetype : num 1
..$ lineend : chr "butt"
..$ arrow : logi FALSE
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_line" "element"
$ rect :List of 5
..$ fill : chr "white"
..$ colour : chr "black"
..$ size : num 0.5
..$ linetype : num 1
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_rect" "element"
$ text :List of 11
..$ family : chr ""
..$ face : chr "plain"
..$ colour : chr "black"
..$ size : num 11
..$ hjust : num 0.5
..$ vjust : num 0.5
..$ angle : num 0
..$ lineheight : num 0.9
..$ margin : 'margin' num [1:4] 0pt 0pt 0pt 0pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : logi FALSE
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ title : NULL
$ aspect.ratio : NULL
$ axis.title :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : num 20
..$ hjust : NULL
..$ vjust : NULL
..$ angle : NULL
..$ lineheight : NULL
..$ margin : NULL
..$ debug : NULL
..$ inherit.blank: logi FALSE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ axis.title.x :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : NULL
..$ hjust : NULL
..$ vjust : num 1
..$ angle : NULL
..$ lineheight : NULL
..$ margin : 'margin' num [1:4] 2.75pt 0pt 0pt 0pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ axis.title.x.top :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : NULL
..$ hjust : NULL
..$ vjust : num 0
..$ angle : NULL
..$ lineheight : NULL
..$ margin : 'margin' num [1:4] 0pt 0pt 2.75pt 0pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ axis.title.x.bottom : NULL
$ axis.title.y :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : NULL
..$ hjust : NULL
..$ vjust : num 1
..$ angle : num 90
..$ lineheight : NULL
..$ margin : 'margin' num [1:4] 0pt 2.75pt 0pt 0pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ axis.title.y.left : NULL
$ axis.title.y.right :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : NULL
..$ hjust : NULL
..$ vjust : num 0
..$ angle : num -90
..$ lineheight : NULL
..$ margin : 'margin' num [1:4] 0pt 0pt 0pt 2.75pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ axis.text :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : chr "grey30"
..$ size : num 16
..$ hjust : NULL
..$ vjust : NULL
..$ angle : NULL
..$ lineheight : NULL
..$ margin : NULL
..$ debug : NULL
..$ inherit.blank: logi FALSE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ axis.text.x :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : NULL
..$ hjust : NULL
..$ vjust : num 1
..$ angle : NULL
..$ lineheight : NULL
..$ margin : 'margin' num [1:4] 2.2pt 0pt 0pt 0pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ axis.text.x.top :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : NULL
..$ hjust : NULL
..$ vjust : num 0
..$ angle : NULL
..$ lineheight : NULL
..$ margin : 'margin' num [1:4] 0pt 0pt 2.2pt 0pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ axis.text.x.bottom : NULL
$ axis.text.y :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : NULL
..$ hjust : num 1
..$ vjust : NULL
..$ angle : NULL
..$ lineheight : NULL
..$ margin : 'margin' num [1:4] 0pt 2.2pt 0pt 0pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ axis.text.y.left : NULL
$ axis.text.y.right :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : NULL
..$ hjust : num 0
..$ vjust : NULL
..$ angle : NULL
..$ lineheight : NULL
..$ margin : 'margin' num [1:4] 0pt 0pt 0pt 2.2pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ axis.ticks :List of 6
..$ colour : chr "grey20"
..$ size : NULL
..$ linetype : NULL
..$ lineend : NULL
..$ arrow : logi FALSE
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_line" "element"
$ axis.ticks.x : NULL
$ axis.ticks.x.top : NULL
$ axis.ticks.x.bottom : NULL
$ axis.ticks.y : NULL
$ axis.ticks.y.left : NULL
$ axis.ticks.y.right : NULL
$ axis.ticks.length : 'unit' num 2.75pt
..- attr(*, "valid.unit")= int 8
..- attr(*, "unit")= chr "pt"
$ axis.ticks.length.x : NULL
$ axis.ticks.length.x.top : NULL
$ axis.ticks.length.x.bottom: NULL
$ axis.ticks.length.y : NULL
$ axis.ticks.length.y.left : NULL
$ axis.ticks.length.y.right : NULL
$ axis.line : list()
..- attr(*, "class")= chr [1:2] "element_blank" "element"
$ axis.line.x : NULL
$ axis.line.x.top : NULL
$ axis.line.x.bottom : NULL
$ axis.line.y : NULL
$ axis.line.y.left : NULL
$ axis.line.y.right : NULL
$ legend.background :List of 5
..$ fill : NULL
..$ colour : logi NA
..$ size : NULL
..$ linetype : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_rect" "element"
$ legend.margin : 'margin' num [1:4] 5.5pt 5.5pt 5.5pt 5.5pt
..- attr(*, "valid.unit")= int 8
..- attr(*, "unit")= chr "pt"
$ legend.spacing : 'unit' num 11pt
..- attr(*, "valid.unit")= int 8
..- attr(*, "unit")= chr "pt"
$ legend.spacing.x : NULL
$ legend.spacing.y : NULL
$ legend.key :List of 5
..$ fill : chr "white"
..$ colour : logi NA
..$ size : NULL
..$ linetype : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_rect" "element"
$ legend.key.size : 'unit' num 1.2lines
..- attr(*, "valid.unit")= int 3
..- attr(*, "unit")= chr "lines"
$ legend.key.height : NULL
$ legend.key.width : NULL
$ legend.text :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : 'rel' num 0.8
..$ hjust : NULL
..$ vjust : NULL
..$ angle : NULL
..$ lineheight : NULL
..$ margin : NULL
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ legend.text.align : NULL
$ legend.title :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : NULL
..$ hjust : num 0
..$ vjust : NULL
..$ angle : NULL
..$ lineheight : NULL
..$ margin : NULL
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ legend.title.align : NULL
$ legend.position : chr "right"
$ legend.direction : NULL
$ legend.justification : chr "center"
$ legend.box : NULL
$ legend.box.just : NULL
$ legend.box.margin : 'margin' num [1:4] 0cm 0cm 0cm 0cm
..- attr(*, "valid.unit")= int 1
..- attr(*, "unit")= chr "cm"
$ legend.box.background : list()
..- attr(*, "class")= chr [1:2] "element_blank" "element"
$ legend.box.spacing : 'unit' num 11pt
..- attr(*, "valid.unit")= int 8
..- attr(*, "unit")= chr "pt"
$ panel.background :List of 5
..$ fill : chr "white"
..$ colour : logi NA
..$ size : NULL
..$ linetype : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_rect" "element"
$ panel.border :List of 5
..$ fill : logi NA
..$ colour : chr "grey20"
..$ size : NULL
..$ linetype : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_rect" "element"
$ panel.spacing : 'unit' num 5.5pt
..- attr(*, "valid.unit")= int 8
..- attr(*, "unit")= chr "pt"
$ panel.spacing.x : NULL
$ panel.spacing.y : NULL
$ panel.grid :List of 6
..$ colour : chr "grey92"
..$ size : NULL
..$ linetype : NULL
..$ lineend : NULL
..$ arrow : logi FALSE
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_line" "element"
$ panel.grid.major : NULL
$ panel.grid.minor :List of 6
..$ colour : NULL
..$ size : 'rel' num 0.5
..$ linetype : NULL
..$ lineend : NULL
..$ arrow : logi FALSE
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_line" "element"
$ panel.grid.major.x : NULL
$ panel.grid.major.y : NULL
$ panel.grid.minor.x : NULL
$ panel.grid.minor.y : NULL
$ panel.ontop : logi FALSE
$ plot.background :List of 5
..$ fill : NULL
..$ colour : chr "white"
..$ size : NULL
..$ linetype : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_rect" "element"
$ plot.title :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : num 25
..$ hjust : num 0
..$ vjust : num 1
..$ angle : NULL
..$ lineheight : NULL
..$ margin : 'margin' num [1:4] 0pt 0pt 5.5pt 0pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : NULL
..$ inherit.blank: logi FALSE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ plot.title.position : chr "panel"
$ plot.subtitle :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : num 18
..$ hjust : num 0
..$ vjust : num 1
..$ angle : NULL
..$ lineheight : NULL
..$ margin : 'margin' num [1:4] 0pt 0pt 5.5pt 0pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : NULL
..$ inherit.blank: logi FALSE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ plot.caption :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : 'rel' num 0.8
..$ hjust : num 1
..$ vjust : num 1
..$ angle : NULL
..$ lineheight : NULL
..$ margin : 'margin' num [1:4] 5.5pt 0pt 0pt 0pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ plot.caption.position : chr "panel"
$ plot.tag :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : 'rel' num 1.2
..$ hjust : num 0.5
..$ vjust : num 0.5
..$ angle : NULL
..$ lineheight : NULL
..$ margin : NULL
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ plot.tag.position : chr "topleft"
$ plot.margin : 'margin' num [1:4] 5.5pt 5.5pt 5.5pt 5.5pt
..- attr(*, "valid.unit")= int 8
..- attr(*, "unit")= chr "pt"
$ strip.background :List of 5
..$ fill : chr "grey85"
..$ colour : chr "grey20"
..$ size : NULL
..$ linetype : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_rect" "element"
$ strip.background.x : NULL
$ strip.background.y : NULL
$ strip.placement : chr "inside"
$ strip.text :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : chr "grey10"
..$ size : 'rel' num 0.8
..$ hjust : NULL
..$ vjust : NULL
..$ angle : NULL
..$ lineheight : NULL
..$ margin : 'margin' num [1:4] 4.4pt 4.4pt 4.4pt 4.4pt
.. ..- attr(*, "valid.unit")= int 8
.. ..- attr(*, "unit")= chr "pt"
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ strip.text.x : NULL
$ strip.text.y :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : NULL
..$ hjust : NULL
..$ vjust : NULL
..$ angle : num -90
..$ lineheight : NULL
..$ margin : NULL
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
$ strip.switch.pad.grid : 'unit' num 2.75pt
..- attr(*, "valid.unit")= int 8
..- attr(*, "unit")= chr "pt"
$ strip.switch.pad.wrap : 'unit' num 2.75pt
..- attr(*, "valid.unit")= int 8
..- attr(*, "unit")= chr "pt"
$ strip.text.y.left :List of 11
..$ family : NULL
..$ face : NULL
..$ colour : NULL
..$ size : NULL
..$ hjust : NULL
..$ vjust : NULL
..$ angle : num 90
..$ lineheight : NULL
..$ margin : NULL
..$ debug : NULL
..$ inherit.blank: logi TRUE
..- attr(*, "class")= chr [1:2] "element_text" "element"
- attr(*, "class")= chr [1:2] "theme" "gg"
- attr(*, "complete")= logi TRUE
- attr(*, "validate")= logi TRUE
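
Output of this shape is what one would typically see when inspecting a theme object with `str()`. As a hedged sketch, the same theme settings can be stored once in a variable and reused across plots; the name `my_theme` is hypothetical and does not come from the source.

```r
library(ggplot2)

# Store the black and white theme plus the size customizations in one object.
# my_theme is a hypothetical name chosen for this sketch.
my_theme <- theme_bw() +
  theme(axis.title = element_text(size = 20),
        axis.text = element_text(size = 16),
        plot.title = element_text(size = 25),
        plot.subtitle = element_text(size = 18))

# Inspect only the top level of the theme to keep the printout short.
str(my_theme, max.level = 1)

# Reuse the stored theme on any plot (stand-in data for illustration).
demo_plot <- ggplot(data.frame(age = c(20, 40, 60)), aes(x = age)) +
  geom_histogram(binwidth = 20) +
  my_theme
```

Reusing one theme object keeps the histogram and the scatterplot visually consistent without repeating the `theme()` call.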

ggplot

| Objective | Complete |
|---|---|
| Formulate the process of using ggplot2 to build plots | ✔ |
| Build a histogram and a scatterplot with ggplot2 | ✔ |

You are now ready to try tasks 1-7 in the Exercise for this topic